Factored Models for Deep Machine Translation

Authors

  • Kiril Simov
  • Iliana Simova
  • Velislava Todorova
  • Petya Osenova
Abstract

In this paper, we present some preliminary results on Statistical Machine Translation for the Bulgarian-to-English and English-to-Bulgarian directions. Linguistic knowledge was added gradually as factors in the MOSES system. The tests were performed on the QTLeap corpus data in the IT domain for Pilot 1. Training used parallel news data as well as IT-domain data. The BLEU scores show that adding linguistic knowledge improves translation quality.
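
As an illustration of what "adding linguistic knowledge as factors" can look like in practice, the sketch below builds a factored training line in which each token carries its surface form, lemma, and part-of-speech tag joined by the `|` separator used in Moses factored corpora. The helper names (`to_factored_token`, `to_factored_sentence`), the choice of three factors, and the sample annotations are assumptions made for this example; the abstract does not state which factors or tools the authors used.

```python
from typing import Sequence

# Separator between factors of one token in Moses-style factored corpora.
FACTOR_SEP = "|"


def to_factored_token(factors: Sequence[str]) -> str:
    """Join the factors of one word (hypothetical helper).

    Rejects empty factors and factors containing the separator or a space,
    since either would make the factored line ambiguous.
    """
    for factor in factors:
        if not factor or FACTOR_SEP in factor or " " in factor:
            raise ValueError(f"invalid factor: {factor!r}")
    return FACTOR_SEP.join(factors)


def to_factored_sentence(
    words: Sequence[str], lemmas: Sequence[str], tags: Sequence[str]
) -> str:
    """Combine parallel annotation layers into one factored line."""
    if not (len(words) == len(lemmas) == len(tags)):
        raise ValueError("annotation layers must have the same length")
    return " ".join(
        to_factored_token(triple) for triple in zip(words, lemmas, tags)
    )


# Made-up annotations for a short IT-domain sentence (assumed, not from the paper).
line = to_factored_sentence(
    ["the", "files", "were", "deleted"],
    ["the", "file", "be", "delete"],
    ["DT", "NNS", "VBD", "VBN"],
)
print(line)  # the|the|DT files|file|NNS were|be|VBD deleted|delete|VBN
```

Adding factors one at a time (surface only, then surface plus lemma, and so on) would correspond to the gradual setup the abstract describes, with each configuration scored separately by BLEU.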

Similar resources

Factor templates for factored machine translation models

In this paper, we present a method of avoiding the combinatorial explosion encountered in Factored Models during the construction of translation options caused by the large number of possible combinations of target language lemmas and morpho-syntactic factors. We automatically extract factor templates from a word-aligned annotated bilingual corpus and use them to distinguish which morpho-syntac...

Phrase-Based and Deep Syntactic English-to-Czech Statistical Machine Translation

This paper describes our two contributions to the WMT08 shared task: a factored phrase-based model using Moses and a probabilistic tree-transfer model at a deep syntactic layer.

Feature Selection for Factored Phrase-Based Machine Translation

In the presented work we investigate factored models for machine translation. We provide a thorough theoretical description of this machine translation paradigm. We describe a method for evaluating the complexity of factored models and verify its usefulness in practice. We present a software tool for automatic creation of machine translation experiments and search in the space of possible confi...

Factored Translation between Brazilian Portuguese and English

Factored translation is an extension of state-of-the-art phrase-based statistical machine translation (PB-SMT). The main difference in the factored translation approach is that a word is not only a token (its surface form) but a vector composed of different information such as lemma, part-of-speech or morphologic/syntactic tags. In this paper we present some experiments carried out to train and ...

Statistical Translation Models: A Literature Survey

In this survey, we briefly study Phrase-based, Factored and Hierarchical translation models. First we cover the basics of the Phrase-based model. Then we introduce an interesting SMT approach called Factored translation models. We also study the mathematical modeling of the Factored models. Finally, we compare Factored models with Phrase-based models and know their disadvantages which are pulling t...

Publication date: 2015